Importing database with phenotypic values

The database used here is composed of 39 Assaf ewes in the first lactation which had the estrus synchronized and the lambing performed in a short window in order to avoid variations caused by the lactation stage. These animals were housed in individual pens and were milked twice a day in a 1x10 stall milking parlor (DeLaval). In the first stage, these animals were fed ad libitum a total mixed ratio (TMR) formulated from alfalfa hay (particle size > 4 cm) and concentrates (50:50 forage:concentrate ratio). These ewes had their dry matter intake (DMI) and milk yield measured daily for a period of three weeks after a period of three weeks of adaptation to the TMR. For each animal, the feed intake was estimated daily by weighting the dry matter refused. Additionally, morning and evening milk production was weighted for each animal in order to calculate the daily milk yield. Protein, fat, and lactose content were estimated for each animal as described by Barrio et al. (2022). The changes in the body weight (BW) were calculated by each ewe through the recording of two consecutive days at the beginning and at the end of the FE evaluation.

#Replace the pathway by the folder in your computer

FE.db<-read.table("~/post_doc_Leon/projetos/RFI_sheep_R_calculations/databases_FE/FE_AllMetrics_database_sheep_07_02_23.txt", h=T, sep="\t")

head(FE.db)
##   Animal_ID meanBW BWC1 days_period  Fat Protein Milk intake energyTMR DIM
## 1     61472   66.9 -1.0          31 66.2    49.1 2.63  2.610      0.93  23
## 2     61473   73.4  0.0          31 59.8    46.1 2.55  2.714      0.93  29
## 3     61474   66.1  0.6          31 55.3    49.8 1.53  2.462      0.93  32
## 4     61477   54.4 -6.0          31 71.7    50.6 2.18  2.307      0.93  31
## 5     61480   66.8  0.0          31 55.4    49.6 2.63  2.985      0.93  50
## 6     61483   77.9  0.0          31 52.8    49.6 3.04  3.156      0.93  28

The following columns are observed in the database:

Animal_ID: Animal identification

meanBW: Average body weight (kg)

BWC1: Body weight change during the experimental period

days_period: Days in the experiment

Fat: Fat yeild (g/kg)

Protein: Protein yield (g/kg)

Milk: Milk yield (kg/d)

intake: Actual dry matter intake

energyTMR: Energy content of the total mixed ratio (TMR)

DIM: Days in milk

Importing required libraries

library(ggplot2)
library(PerformanceAnalytics)
library(plotly)
library(patchwork)

Estimating additional metrics for FEI, FCR and RFI calculations

In order to calculate the FE metrics, some additional values must be estimated.

Body weight change during the period (BWC2)

#Defined as the body weight change during the whole period divided by the days 
#in the period

FE.db$BWC2<-FE.db$BWC1/FE.db$days_period

Metabolic body weight

#Defined as the body weight raised to 0.75

FE.db$MBW<-FE.db$meanBW^0.75

Energy requirements for BW change, UFL/d

#Defined based on INRA equations

FE.db$ER_BWC<-(0.28*(FE.db$BWC2*1000))/100

Energy requirements for maintenance, UFL/d

#Defined based on INRA equations

FE.db$ER_main<-(0.0345*FE.db$MBW)

Mean milk yield, L/d (MY)

FE.db$MY<-FE.db$Milk*1.036

Fat yield, L/d (MY)

FE.db$FY<-FE.db$Fat*1.036

Protein yield, L/d (MY)

FE.db$PY<-FE.db$Protein*1.036

Metabolic body weight change per day

#Defined as the absolute BWC1 raised to 0.75

FE.db$MBWC1<-abs(FE.db$BWC1)^0.75

Metabolic body weight change for the experimental period

#Defined as the absolute BWC2 raised to 0.75

FE.db$MBWC2<-abs(FE.db$BWC2)^0.75

Energy corrected milk (ECM, kg/d) - INRA 2018

#Defined based on INRA equations

FE.db$ECM<-FE.db$MY*(0.0071*FE.db$FY+0.0043*FE.db$PY+0.2224)

Energy requirements for milk yield, UFL/d

#Defined based on INRA equations

FE.db$ER_MY<-0.686*FE.db$ECM

Total energy requirements, UFL/d

#Defined as the sum of the Energy requirements for MY, maintenance, and BWC

FE.db$TER<-rowSums(FE.db[,c("ER_MY","ER_main","ER_BWC")])

Predicted DM intake

#Defined as the ratio between total energy requirements and the energy on TMR

FE.db$Pred_DMI<-FE.db$TER/FE.db$energyTMR

Checking new database

head(FE.db)
##   Animal_ID meanBW BWC1 days_period  Fat Protein Milk intake energyTMR DIM
## 1     61472   66.9 -1.0          31 66.2    49.1 2.63  2.610      0.93  23
## 2     61473   73.4  0.0          31 59.8    46.1 2.55  2.714      0.93  29
## 3     61474   66.1  0.6          31 55.3    49.8 1.53  2.462      0.93  32
## 4     61477   54.4 -6.0          31 71.7    50.6 2.18  2.307      0.93  31
## 5     61480   66.8  0.0          31 55.4    49.6 2.63  2.985      0.93  50
## 6     61483   77.9  0.0          31 52.8    49.6 3.04  3.156      0.93  28
##          BWC2      MBW      ER_BWC   ER_main      MY      FY      PY     MBWC1
## 1 -0.03225806 23.39212 -0.09032258 0.8070281 2.72468 68.5832 50.8676 1.0000000
## 2  0.00000000 25.07680  0.00000000 0.8651495 2.64180 61.9528 47.7596 0.0000000
## 3  0.01935484 23.18201  0.05419355 0.7997794 1.58508 57.2908 51.5928 0.6817316
## 4 -0.19354839 20.03084 -0.54193548 0.6910640 2.25848 74.2812 52.4216 3.8336586
## 5  0.00000000 23.36589  0.00000000 0.8061232 2.72468 57.3944 51.3856 0.0000000
## 6  0.00000000 26.22123  0.00000000 0.9046325 3.14944 54.7008 51.3856 0.0000000
##        MBWC2      ECM     ER_MY      TER Pred_DMI
## 1 0.07611649 2.528698 1.7346865 2.451392 2.635905
## 2 0.00000000 2.292108 1.5723861 2.437536 2.621006
## 3 0.05189102 1.348925 0.9253624 1.779335 1.913264
## 4 0.29180462 2.202491 1.5109088 1.660037 1.784986
## 5 0.00000000 2.318317 1.5903652 2.396488 2.576869
## 6 0.00000000 2.619496 1.7969740 2.701606 2.904953

Estimating FEI and FCR

After the estimation of the additional metrics above mentioned, the estimation of FCR and FEI it is a straightforward approach.

Feed efficiency index (FEI)

#FEI is defined as Actual dry matter intake subtracted by the predicted dry matter intake

FE.db$FEI<-FE.db$intake-FE.db$Pred_DMI
#Checking distribution of FEI values

ggplot(FE.db, aes(x=FEI)) + 
  geom_histogram(color="black", fill="white") + 
  custom_theme() 

#Checking correlation between FEI and actual intake values

cor_FEI_DMI<-round(cor(FE.db$intake, FE.db$FEI),2)

text_FEI<-paste("Animal ID: ", FE.db$Animal_ID, "\n","DMI: ", 
                round(FE.db$intake,2), "\n" ,
                "FEI: ", round(FE.db$FEI,2),sep="")

FEI_plot<-ggplot(FE.db, aes(y=intake,x=FEI, text=text_FEI)) + 
  geom_point() + 
  custom_theme() + scale_y_continuous(name="DMI") +
  ggtitle(paste("Actual DMI vs FEI (r2= ", cor_FEI_DMI,")",sep=""))

ggplotly(FEI_plot, tooltip = "text")

Feed Conversion Ratio (FCR)

#FCR is defined as Feed intake (kg DM/d) / ECM (kg/d), where ECM is estimated based on the ECM equation from INRA 2018 (for sheep) 

FE.db$FCR<-FE.db$intake/FE.db$ECM
#Checking distribution of FCR values

ggplot(FE.db, aes(x=FCR)) + 
  geom_histogram(color="black", fill="white") + 
  custom_theme() 

#Checking correlation between FCR and actual intake values

cor_FCR_DMI<-round(cor(FE.db$intake, FE.db$FCR),2)

text_FCR<-paste("Animal ID: ", FE.db$Animal_ID, "\n","DMI: ", 
                round(FE.db$intake,2), "\n" ,
                "FCR: ", round(FE.db$FCR,2),sep="")

FCR_plot<-ggplot(FE.db, aes(y=intake,x=FCR, text=text_FCR)) + 
  geom_point() + 
  custom_theme() + scale_y_continuous(name="DMI") +
  ggtitle(paste("Actual DMI vs FCR (r2= ", cor_FCR_DMI,")",sep=""))

ggplotly(FCR_plot, tooltip = "text")

Estimating Residual feed intake (RFI)

The RFI is defined as the subtraction of the actual feed intake by the predicted feed intake. The prediction of feed intake can be performed using different models. Here, for models to predict the feed intake and subsequently the RFI will be shown.

RFI estimation based on the model proposed by Pryce et al. 2015

#A linear model including ECM, MBW, BWC1 and DIM is used to predict the feed intake

model.pryce<-lm(intake ~ ECM + MBW + BWC1 + DIM,data=FE.db)

#Checking model output

summary(model.pryce)
## 
## Call:
## lm(formula = intake ~ ECM + MBW + BWC1 + DIM, data = FE.db)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.35948 -0.08996 -0.00117  0.09014  0.26824 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept) 0.160859   0.348379   0.462  0.64721    
## ECM         0.445107   0.051462   8.649 4.18e-10 ***
## MBW         0.061710   0.012842   4.805 3.07e-05 ***
## BWC1        0.058145   0.017115   3.397  0.00175 ** 
## DIM         0.004811   0.003201   1.503  0.14205    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1382 on 34 degrees of freedom
## Multiple R-squared:  0.8089, Adjusted R-squared:  0.7864 
## F-statistic: 35.97 on 4 and 34 DF,  p-value: 8.943e-12
#Extracting residuals (RFI)
FE.db$RFI_pryce<-residuals(model.pryce)

head(FE.db$RFI_pryce)
## [1] -0.17243875 -0.15410524  0.08131775  0.12941603  0.10977034  0.07636076
#Checking distribution of RFI_pryce values

ggplot(FE.db, aes(x=RFI_pryce)) + 
  geom_histogram(color="black", fill="white") + 
  custom_theme() 

#Checking the correlation between predicted and actual values

r2_pryce<-round(summary(model.pryce)$adj.r.squared,3)

RFI_pryce_table<-data.frame(DMI=FE.db$intake, predicted_DMI=predict(model.pryce))

text_pryce<-paste("Animal ID: ", FE.db$Animal_ID, "\n","DMI: ", 
                round(RFI_pryce_table$DMI,2), "\n" ,
                "RFI_pryce: ", round(RFI_pryce_table$predicted_DMI,2),sep="")

plot_pryce<-ggplot(RFI_pryce_table, aes(y=DMI,x=predicted_DMI, text=text_pryce)) + 
  geom_point() + 
  custom_theme() + 
  ggtitle(paste("Actual DMI vs predicted DMI by pryce (r2= ", r2_pryce,")",sep=""))

ggplotly(plot_pryce, tooltip = "text")

RFI estimation based on ECM, MBW and MBWC2

#A linear model including ECM, MBW and MBWC2 is used to predict the feed intake

model.RFI2<-lm(intake ~ ECM + MBW + MBWC2,data=FE.db)

#Checking model output

summary(model.RFI2)
## 
## Call:
## lm(formula = intake ~ ECM + MBW + MBWC2, data = FE.db)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.52576 -0.05559  0.00190  0.09460  0.31689 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  0.40482    0.34894   1.160    0.254    
## ECM          0.38148    0.06470   5.896 1.06e-06 ***
## MBW          0.06809    0.01490   4.571 5.83e-05 ***
## MBWC2       -0.16276    0.42234  -0.385    0.702    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1757 on 35 degrees of freedom
## Multiple R-squared:  0.6818, Adjusted R-squared:  0.6545 
## F-statistic: 24.99 on 3 and 35 DF,  p-value: 8.006e-09
#Extracting residuals (RFI)
FE.db$RFI2<-residuals(model.RFI2)

head(FE.db$RFI2)
## [1] -0.33978344 -0.27262280 -0.02736055 -0.25437641  0.10486990 -0.03343619
#Checking distribution of RFI_pryce values

ggplot(FE.db, aes(x=RFI2)) + 
  geom_histogram(color="black", fill="white") + 
  custom_theme()

#Checking the correlation between predicted and actual values

r2_RFI2<-round(summary(model.RFI2)$adj.r.squared,3)

RFI_RFI2_table<-data.frame(DMI=FE.db$intake, predicted_DMI=predict(model.RFI2))

text_RFI2<-paste("Animal ID: ", FE.db$Animal_ID, "\n","DMI: ", 
                round(RFI_RFI2_table$DMI,2), "\n" ,
                "RFI2: ", round(RFI_RFI2_table$predicted_DMI,2),sep="")

plot_RFI2<-ggplot(RFI_RFI2_table, aes(y=DMI,x=predicted_DMI, text=text_RFI2)) + 
  geom_point() + 
  custom_theme() + 
  ggtitle(paste("Actual DMI vs predicted DMI by RFI2 (r2= ", r2_RFI2,")",sep=""))

ggplotly(plot_RFI2, tooltip = "text")

RFI estimation based on ECM, MBW and an interaction between MeanBW and BWC2

#A linear model including ECM, MBW and an interaction term between meanBW and BWC2 is used to predict the feed intake

model.RFI3<-lm(intake ~ ECM + MBW + meanBW*BWC2,data=FE.db)

#Checking model output

summary(model.RFI3)
## 
## Call:
## lm(formula = intake ~ ECM + MBW + meanBW * BWC2, data = FE.db)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.33958 -0.09750  0.01234  0.07692  0.27795 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## (Intercept)  1.64677    3.93215   0.419    0.678    
## ECM          0.46228    0.05523   8.370 1.14e-09 ***
## MBW         -0.14905    0.69816  -0.213    0.832    
## meanBW       0.05384    0.18580   0.290    0.774    
## BWC2        -0.59445    3.83057  -0.155    0.878    
## meanBW:BWC2  0.04488    0.06186   0.725    0.473    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1432 on 33 degrees of freedom
## Multiple R-squared:  0.8009, Adjusted R-squared:  0.7707 
## F-statistic: 26.54 on 5 and 33 DF,  p-value: 1.124e-10
#Extracting residuals (RFI)
FE.db$RFI3<-residuals(model.RFI3)

head(FE.db$RFI3)
## [1] -0.24343042 -0.20660344  0.04212459  0.05618714  0.15262329  0.01234396
#Checking distribution of RFI_pryce values

ggplot(FE.db, aes(x=RFI3)) + 
  geom_histogram(color="black", fill="white") + 
  custom_theme()

#Checking the correlation between predicted and actual values

r2_RFI3<-round(summary(model.RFI3)$adj.r.squared,3)

RFI_RFI3_table<-data.frame(DMI=FE.db$intake, predicted_DMI=predict(model.RFI3))

text_RFI3<-paste("Animal ID: ", FE.db$Animal_ID, "\n","DMI: ", 
                round(RFI_RFI3_table$DMI,2), "\n" ,
                "RFI3: ", round(RFI_RFI3_table$predicted_DMI,2),sep="")

plot_RFI3<-ggplot(RFI_RFI3_table, aes(y=DMI,x=predicted_DMI, text=text_RFI3)) + 
  geom_point() + 
  custom_theme() + 
  ggtitle(paste("Actual DMI vs predicted DMI by pryce (r2= ", r2_RFI3,")",sep=""))

ggplotly(plot_RFI3, tooltip = "text")

Estimating correlation between FE metrics

The Pearson correlation across the FE variables estimated here will be calculated and plotted using the R package PerformanceAnalytics. It is possible to note that, in general, a strong correlation is observed between the variables.

chart.Correlation(FE.db[,c("FEI","FCR", "RFI_pryce", "RFI2", "RFI3")], histogram=TRUE, pch=19)

Additional comment

If we check the summary of the models used here to estimate RFI, we will notice that the intercept is not significant for any model. Consequently, what would happen if we exclude the intercept from our models? But, even more important… Should we remove the intercept from our model in this case?

RFI estimation based on the model proposed by Pryce et al. 2015

#A linear model including ECM, MBW, BWC1 and DIM is used to predict the feed intake

model.pryce_NoInt<-lm(intake ~ 0 + ECM + MBW + BWC1 + DIM,data=FE.db)

#Checking model output

summary(model.pryce_NoInt)
## 
## Call:
## lm(formula = intake ~ 0 + ECM + MBW + BWC1 + DIM, data = FE.db)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.36075 -0.09194  0.00215  0.09168  0.27547 
## 
## Coefficients:
##      Estimate Std. Error t value Pr(>|t|)    
## ECM  0.445353   0.050877   8.753 2.45e-10 ***
## MBW  0.066934   0.006006  11.145 4.62e-13 ***
## BWC1 0.055461   0.015916   3.485  0.00134 ** 
## DIM  0.005738   0.002465   2.328  0.02582 *  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1366 on 35 degrees of freedom
## Multiple R-squared:  0.9979, Adjusted R-squared:  0.9977 
## F-statistic:  4154 on 4 and 35 DF,  p-value: < 2.2e-16
#Extracting residuals (RFI)
FE.db$RFI_pryce_NoInt<-residuals(model.pryce_NoInt)

head(FE.db$RFI_pryce_NoInt)
## [1] -0.15841558 -0.15170324  0.09268018  0.14024440  0.10163771  0.07363027
#Checking distribution of RFI_pryce_NoInt values

ggplot(FE.db, aes(x=RFI_pryce_NoInt)) + 
  geom_histogram(color="black", fill="white") + 
  custom_theme() 

#Checking the correlation between predicted and actual values

r2_pryce_NoInt<-round(summary(model.pryce)$adj.r.squared,3)

RFI_pryce_NoInt_table<-data.frame(DMI=FE.db$intake, predicted_DMI=predict(model.pryce))

text_pryce_NoInt<-paste("Animal ID: ", FE.db$Animal_ID, "\n","DMI: ", 
                round(RFI_pryce_NoInt_table$DMI,2), "\n" ,
                "RFI_pryce_NoInt: ", 
                round(RFI_pryce_NoInt_table$predicted_DMI,2),sep="")

plot_pryce_NoInt<-ggplot(RFI_pryce_NoInt_table, aes(y=DMI,x=predicted_DMI, text=text_pryce_NoInt)) + 
  geom_point() + 
  custom_theme() + 
  ggtitle(paste("Actual DMI vs predicted DMI by pryce (r2= ", r2_pryce_NoInt,")",sep=""))

ggplotly(plot_pryce_NoInt, tooltip = "text")

RFI estimation based on ECM, MBW and MBWC2

#A linear model including ECM, MBW and MBWC2 is used to predict the feed intake

model.RFI2_NoInt<-lm(intake ~ 0 + ECM + MBW + MBWC2,data=FE.db)

#Checking model output

summary(model.RFI2_NoInt)
## 
## Call:
## lm(formula = intake ~ 0 + ECM + MBW + MBWC2, data = FE.db)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.53479 -0.06999  0.01566  0.08788  0.33739 
## 
## Coefficients:
##        Estimate Std. Error t value Pr(>|t|)    
## ECM    0.400934   0.062785   6.386 2.13e-07 ***
## MBW    0.083561   0.006668  12.532 1.08e-14 ***
## MBWC2 -0.078715   0.418074  -0.188    0.852    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1766 on 36 degrees of freedom
## Multiple R-squared:  0.9964, Adjusted R-squared:  0.9961 
## F-statistic:  3310 on 3 and 36 DF,  p-value: < 2.2e-16
#Extracting residuals (RFI)
FE.db$RFI2_NoInt<-residuals(model.RFI2_NoInt)

head(FE.db$RFI2_NoInt)
## [1] -0.35250696 -0.30041399 -0.01184591 -0.22687153  0.10304216 -0.08530448
#Checking distribution of RFI_pryce values

ggplot(FE.db, aes(x=RFI2_NoInt)) + 
  geom_histogram(color="black", fill="white") + 
  custom_theme()

#Checking the correlation between predicted and actual values

r2_RFI2_NoInt<-round(summary(model.RFI2_NoInt)$adj.r.squared,3)

RFI_RFI2_NoInt_table<-data.frame(DMI=FE.db$intake, predicted_DMI=predict(model.RFI2_NoInt))

text_RFI2_NoInt<-paste("Animal ID: ", FE.db$Animal_ID, "\n","DMI: ", 
                round(RFI_RFI2_NoInt_table$DMI,2), "\n" ,
                "RFI2_NoInt: ", 
                round(RFI_RFI2_NoInt_table$predicted_DMI,2),sep="")

plot_RFI2_NoInt<-ggplot(RFI_RFI2_NoInt_table, aes(y=DMI,x=predicted_DMI, text=text_RFI2_NoInt)) + 
  geom_point() + 
  custom_theme() + 
  ggtitle(paste("Actual DMI vs predicted DMI by RFI2 (r2= ", r2_RFI2_NoInt,")",sep=""))

ggplotly(plot_RFI2_NoInt, tooltip = "text")

RFI estimation based on ECM, MBW and an interaction between MeanBW and BWC2

#A linear model including ECM, MBW and an interaction term between meanBW and BWC2 is used to predict the feed intake

model.RFI3_NoInt<-lm(intake ~ 0 + ECM + MBW + meanBW*BWC2,data=FE.db)

#Checking model output

summary(model.RFI3_NoInt)
## 
## Call:
## lm(formula = intake ~ 0 + ECM + MBW + meanBW * BWC2, data = FE.db)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.34223 -0.08871  0.01206  0.07004  0.27763 
## 
## Coefficients:
##             Estimate Std. Error t value Pr(>|t|)    
## ECM          0.46933    0.05196   9.033 1.48e-10 ***
## MBW          0.14288    0.03856   3.706 0.000746 ***
## meanBW      -0.02377    0.01317  -1.804 0.080068 .  
## BWC2        -1.01090    3.65411  -0.277 0.783726    
## meanBW:BWC2  0.05158    0.05903   0.874 0.388345    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1414 on 34 degrees of freedom
## Multiple R-squared:  0.9978, Adjusted R-squared:  0.9975 
## F-statistic:  3101 on 5 and 34 DF,  p-value: < 2.2e-16
#Extracting residuals (RFI)
FE.db$RFI3_NoInt<-residuals(model.RFI3_NoInt)

head(FE.db$RFI3_NoInt)
## [1] -0.25011614 -0.19998264  0.04146612  0.05185153  0.14628789  0.03181152
#Checking distribution of RFI_pryce values

ggplot(FE.db, aes(x=RFI3_NoInt)) + 
  geom_histogram(color="black", fill="white") + 
  custom_theme()

#Checking the correlation between predicted and actual values

r2_RFI3_NoInt<-round(summary(model.RFI3_NoInt)$adj.r.squared,3)

RFI_RFI3_NoInt_table<-data.frame(DMI=FE.db$intake, predicted_DMI=predict(model.RFI3_NoInt))

text_RFI3_NoInt<-paste("Animal ID: ", FE.db$Animal_ID, "\n","DMI: ", 
                round(RFI_RFI3_NoInt_table$DMI,2), "\n" ,
                "RFI3_NoInt: ", 
                round(RFI_RFI3_NoInt_table$predicted_DMI,2),sep="")

plot_RFI3_NoInt<-ggplot(RFI_RFI3_NoInt_table, aes(y=DMI,x=predicted_DMI, text=text_RFI3_NoInt)) + 
  geom_point() + 
  custom_theme() + 
  ggtitle(paste("Actual DMI vs predicted DMI by RFI3 (r2= ", r2_RFI3_NoInt,")",sep=""))

ggplotly(plot_RFI3_NoInt, tooltip = "text")

It is posisble to note that much higher adjusted R-squares were obtained for all the models. However, should we trust on those values?

Problems when you remove the intercept

Let’s use the models with higher R-adjusted with intercept and its version without intercept as example. First, we should check the distribution of the residuals obtained in both models.

# Checking the mean and standard deviation of the residuals obtained in each model

table.res<-data.frame(res_pryce=residuals(model.pryce), res_pryce_NoInt=residuals(model.pryce_NoInt))

head(table.res)
##     res_pryce res_pryce_NoInt
## 1 -0.17243875     -0.15841558
## 2 -0.15410524     -0.15170324
## 3  0.08131775      0.09268018
## 4  0.12941603      0.14024440
## 5  0.10977034      0.10163771
## 6  0.07636076      0.07363027
Mean and standard deviation of the residuals obtained in the models based on Pryce et al. (2015) with and without intercept.
Mean SD
RFI Pryce 0e+00 0.1307
RFI Pryce without intercept 6e-04 0.1311
res.pryce<-ggplot(model.pryce, aes(y=.resid,x=.fitted)) + 
  geom_point() + 
  custom_theme() + 
  ggtitle("RFI Pryce et al. (2015)") + 
  custom_theme() + scale_y_continuous(name="Residuals") +
  scale_x_continuous(name="Fitted values") 

res.pryce.NoInt<-ggplot(model.pryce_NoInt, aes(y=.resid,x=.fitted)) + 
  geom_point() + 
  custom_theme() + 
  ggtitle("RFI Pryce et al. (2015) without intercept") + 
  custom_theme() + scale_y_continuous(name="Residuals") +
  scale_x_continuous(name="Fitted values") 
  
res.pryce | res.pryce.NoInt

Conclusion about intercept removal

As we can note, the SD obtained for the residuals in both models are quite similar. However, the mean of the residuals in the model without intercept is not equal zero, which is a requirement for the kind of model used here. Additionally, when we remove the intercept, we force the program to set the this value to be equal zero. Consequently, assuming that our the intercepts of our regression lines pass through the zero, which is not the case of most of the models fitted with real data (in the case of a continuous variable). In summary, to run a model without intercept, in the current case, result in an inflation of the sums of squares (SS) for the model (SSmodel) and residuals (SSresidual), leading to the increase in the R-square values.